error variance
Identifiability of the minimum-trace directed acyclic graph and hill climbing algorithms without strict local optima under weakly increasing error variances
Chang, Hyunwoong, Kim, Jaehoan
We prove that the true underlying directed acyclic graph (DAG) in Gaussian linear structural equation models is identifiable as the minimum-trace DAG when the error variances are weakly increasing with respect to the true causal ordering. This result bridges two existing frameworks as it extends the identifiable cases within the minimum-trace DAG method and provides a principled interpretation of the algorithmic ordering search approach, revealing that its objective is actually to minimize the total residual sum of squares. On the computational side, we prove that the hill climbing algorithm with a random-to-random (R2R) neighborhood does not admit any strict local optima. Under standard settings, we confirm the result through extensive simulations, observing only a few weak local optima. Interestingly, algorithms using other neighborhoods of equal size exhibit suboptimal behavior, having strict local optima and a substantial number of weak local optima.
Causal discovery in deterministic discrete LTI-DAE systems
Konkathi, Bala Rajesh, Tangirala, Arun K.
Discovering pure causes or driver variables in deterministic LTI systems is of vital importance in the data-driven reconstruction of causal networks. A recent work by Kathari and Tangirala, proposed in 2022, formulated the causal discovery method as a constraint identification problem. The constraints are identified using a dynamic iterative PCA (DIPCA)-based approach for dynamical systems corrupted with Gaussian measurement errors. The DIPCA-based method works efficiently for dynamical systems devoid of any algebraic relations. However, several dynamical systems operate under feedback control and/or are coupled with conservation laws, leading to differential-algebraic (DAE) or mixed causal systems. In this work, a method, namely the partition of variables (PoV), for causal discovery in LTI-DAE systems is proposed. This method is superior to the method that was presented by Kathari and Tangirala (2022), as PoV also works for pure dynamical systems, which are devoid of algebraic equations. The proposed method identifies the causal drivers up to a minimal subset. PoV deploys DIPCA to first determine the number of algebraic relations ($n_a$), the number of dynamical relations ($n_d$) and the constraint matrix. Subsequently, the subsets are identified through an admissible partitioning of the constraint matrix by finding the condition number of it. Case studies are presented to demonstrate the effectiveness of the proposed method.
Using Parametric PINNs for Predicting Internal and External Turbulent Flows
Ghosh, Shinjan, Chakraborty, Amit, Brikis, Georgia Olympia, Dey, Biswadip
Computational fluid dynamics (CFD) solvers employing two-equation eddy viscosity models are the industry standard for simulating turbulent flows using the Reynolds-averaged Navier-Stokes (RANS) formulation. While these methods are computationally less expensive than direct numerical simulations, they can still incur significant computational costs to achieve the desired accuracy. In this context, physics-informed neural networks (PINNs) offer a promising approach for developing parametric surrogate models that leverage both existing, but limited CFD solutions and the governing differential equations to predict simulation outcomes in a computationally efficient, differentiable, and near real-time manner. In this work, we build upon the previously proposed RANS-PINN framework, which only focused on predicting flow over a cylinder. To investigate the efficacy of RANS-PINN as a viable approach to building parametric surrogate models, we investigate its accuracy in predicting relevant turbulent flow variables for both internal and external flows. To ensure training convergence with a more complex loss function, we adopt a novel sampling approach that exploits the domain geometry to ensure a proper balance among the contributions from various regions within the solution domain. The effectiveness of this framework is then demonstrated for two scenarios that represent a broad class of internal and external flow problems.
Online waveform selection for cognitive radar
Tholeti, Thulasi, Rangarajan, Avinash, Kalyani, Sheetal
Designing a cognitive radar system capable of adapting its parameters is challenging, particularly when tasked with tracking a ballistic missile throughout its entire flight. In this work, we focus on proposing adaptive algorithms that select waveform parameters in an online fashion. Our novelty lies in formulating the learning problem using domain knowledge derived from the characteristics of ballistic trajectories. We propose three reinforcement learning algorithms: bandwidth scaling, Q-learning, and Q-learning lookahead. These algorithms dynamically choose the bandwidth for each transmission based on received feedback. Through experiments on synthetically generated ballistic trajectories, we demonstrate that our proposed algorithms achieve the dual objectives of minimizing range error and maintaining continuous tracking without losing the target.
Generalized Criterion for Identifiability of Additive Noise Models Using Majorization
The discovery of causal relationships from observational data is very challenging. Many recent approaches rely on complexity or uncertainty concepts to impose constraints on probability distributions, aiming to identify specific classes of directed acyclic graph (DAG) models. In this paper, we introduce a novel identifiability criterion for DAGs that places constraints on the conditional variances of additive noise models. We demonstrate that this criterion extends and generalizes existing identifiability criteria in the literature that employ (conditional) variances as measures of uncertainty in (conditional) distributions. For linear Structural Equation Models, we present a new algorithm that leverages the concept of weak majorization applied to the diagonal elements of the Cholesky factor of the covariance matrix to learn a topological ordering of variables. Through extensive simulations and the analysis of bank connectivity data, we provide evidence of the effectiveness of our approach in successfully recovering DAGs. The code for reproducing the results in this paper is available in Supplementary Materials.
Multi-Timescale Ensemble Q-learning for Markov Decision Process Policy Optimization
Reinforcement learning (RL) is a classical tool to solve network control or policy optimization problems in unknown environments. The original Q-learning suffers from performance and complexity challenges across very large networks. Herein, a novel model-free ensemble reinforcement learning algorithm which adapts the classical Q-learning is proposed to handle these challenges for networks which admit Markov decision process (MDP) models. Multiple Q-learning algorithms are run on multiple, distinct, synthetically created and structurally related Markovian environments in parallel; the outputs are fused using an adaptive weighting mechanism based on the Jensen-Shannon divergence (JSD) to obtain an approximately optimal policy with low complexity. The theoretical justification of the algorithm, including the convergence of key statistics and Q-functions are provided. Numerical results across several network models show that the proposed algorithm can achieve up to 55% less average policy error with up to 50% less runtime complexity than the state-of-the-art Q-learning algorithms. Numerical results validate assumptions made in the theoretical analysis.
Bayesian Approach to Linear Bayesian Networks
Hwang, Seyong, Lee, Kyoungjae, Oh, Sunmin, Park, Gunwoong
This study proposes the first Bayesian approach for learning high-dimensional linear Bayesian networks. The proposed approach iteratively estimates each element of the topological ordering from backward and its parent using the inverse of a partial covariance matrix. The proposed method successfully recovers the underlying structure when Bayesian regularization for the inverse covariance matrix with unequal shrinkage is applied. Specifically, it shows that the number of samples $n = \Omega( d_M^2 \log p)$ and $n = \Omega(d_M^2 p^{2/m})$ are sufficient for the proposed algorithm to learn linear Bayesian networks with sub-Gaussian and 4m-th bounded-moment error distributions, respectively, where $p$ is the number of nodes and $d_M$ is the maximum degree of the moralized graph. The theoretical findings are supported by extensive simulation studies including real data analysis. Furthermore the proposed method is demonstrated to outperform state-of-the-art frequentist approaches, such as the BHLSM, LISTEN, and TD algorithms in synthetic data.
To Compute or not to Compute? Adaptive Smart Sensing in Resource-Constrained Edge Computing
Ballotta, Luca, Peserico, Giovanni, Zanini, Francesco, Dini, Paolo
We consider a network of smart sensors for an edge computing application that sample a time-varying signal and send updates to a base station for remote global monitoring. Sensors are equipped with sensing and compute, and can either send raw data or process them on-board before transmission. Limited hardware resources at the edge generate a fundamental latency-accuracy trade-off: raw measurements are inaccurate but timely, whereas accurate processed updates are available after processing delay. Hence, one needs to decide when sensors should transmit raw measurements or rely on local processing to maximize network monitoring performance. To tackle this sensing design problem, we model an estimation-theoretic optimization framework that embeds both computation and communication latency, and propose a Reinforcement Learning-based approach that dynamically allocates computational resources at each sensor. Effectiveness of our proposed approach is validated through numerical experiments motivated by smart sensing for the Internet of Drones and self-driving vehicles. In particular, we show that, under constrained computation at the base station, monitoring performance can be further improved by an online sensor selection.
Bayesian Simultaneous Factorization and Prediction Using Multi-Omic Data
Samorodnitsky, Sarah, Wendt, Chris H., Lock, Eric F.
Understanding of the pathophysiology of obstructive lung disease (OLD) is limited by available methods to examine the relationship between multi-omic molecular phenomena and clinical outcomes. Integrative factorization methods for multi-omic data can reveal latent patterns of variation describing important biological signal. However, most methods do not provide a framework for inference on the estimated factorization, simultaneously predict important disease phenotypes or clinical outcomes, nor accommodate multiple imputation. To address these gaps, we propose Bayesian Simultaneous Factorization (BSF). We use conjugate normal priors and show that the posterior mode of this model can be estimated by solving a structured nuclear norm-penalized objective that also achieves rank selection and motivates the choice of hyperparameters. We then extend BSF to simultaneously predict a continuous or binary response, termed Bayesian Simultaneous Factorization and Prediction (BSFP). BSF and BSFP accommodate concurrent imputation and full posterior inference for missing data, including "blockwise" missingness, and BSFP offers prediction of unobserved outcomes. We show via simulation that BSFP is competitive in recovering latent variation structure, as well as the importance of propagating uncertainty from the estimated factorization to prediction. We also study the imputation performance of BSF via simulation under missing-at-random and missing-not-at-random assumptions. Lastly, we use BSFP to predict lung function based on the bronchoalveolar lavage metabolome and proteome from a study of HIV-associated OLD. Our analysis reveals a distinct cluster of patients with OLD driven by shared metabolomic and proteomic expression patterns, as well as multi-omic patterns related to lung function decline. Software is freely available at https://github.com/sarahsamorodnitsky/BSFP .
From Sensor to Processing Networks: Optimal Estimation with Computation and Communication Latency
Ballotta, Luca, Schenato, Luca, Carlone, Luca
This paper investigates the use of a networked system ($e.g.$, swarm of robots, smart grid, sensor network) to monitor a time-varying phenomenon of interest in the presence of communication and computation latency. Recent advances in edge computing have enabled processing to be spread across the network, hence we investigate the fundamental computation-communication trade-off, arising when a sensor has to decide whether to transmit raw data (incurring communication delay) or preprocess them (incurring computational delay) in order to compute an accurate estimate of the state of the phenomenon of interest. We propose two key contributions. First, we formalize the notion of $processing$ $network$. Contrarily to $sensor$ $and$ $communication$ $networks$, where the designer is concerned with the design of a suitable communication policy, in a processing network one can also control when and where the computation occurs in the network. The second contribution is to provide analytical results on the optimal preprocessing delay ($i.e.$, the optimal time spent on computations at each sensor) for the case with a single sensor and multiple homogeneous sensors. Numerical results substantiate our claims that accounting for computation latencies (both at sensor and estimator side) and communication delays can largely impact the estimation accuracy.